Development and validation of MRI-based radiomics signatures as new markers for preoperative assessment of EGFR mutation and subtypes from bone metastases

Background This study aimed to develop and externally validate contrast-enhanced (CE) T1-weighted MRI-based radiomics for the identification of epidermal growth factor receptor (EGFR) mutation, exon-19 deletion and exon-21 L858R mutation from MR imaging of spinal bone metastasis from primary lung adenocarcinoma. Methods A total of 159 patients from our hospital between January 2017 and September 2021 formed a primary set, and 24 patients from another center between January 2017 and October 2021 formed an independent validation set. Radiomics features were extracted from the CET1 MRI using the Pyradiomics method. The least absolute shrinkage and selection operator (LASSO) regression was applied for selecting the most predictive features. Radiomics signatures (RSs) were developed based on the primary training set to predict EGFR mutations and differentiate between exon-19 deletion and exon-21 L858R. The RSs were validated on the internal and external validation sets using the Receiver Operating Characteristic (ROC) curve analysis. Results Eight, three, and five most predictive features were selected to build RS-EGFR, RS-19, and RS-21 for predicting EGFR mutation, exon-19 deletion and exon-21 L858R, respectively. The RSs generated favorable prediction efficacies for the primary (AUCs, RS-EGFR vs. RS-19 vs. RS-21, 0.851 vs. 0.816 vs. 0.814) and external validation (AUCs, RS-EGFR vs. RS-19 vs. RS-21, 0.807 vs. 0.742 vs. 0.792) sets. Conclusions Radiomics features from the CE MRI could be used to detect the EGFR mutation, increasing the certainty of identifying exon-19 deletion and exon-21 L858R mutations based on spinal metastasis MR imaging. Supplementary Information The online version contains supplementary material available at 10.1186/s12885-022-09985-4.


† Ying Fan, Yue Dong and Xinyan Sun contributed equally to this work. *Correspondence: xrjiang@cmu.edu.cn

Introduction
Lung cancer is the most frequently diagnosed cancer worldwide and continues to increase in both incidence and mortality [1][2][3]. Non-small-cell lung cancer (NSCLC) represents approximately 85% of all lung cancer cases, of which lung adenocarcinoma (LUAD) is the most common histologic subtype [4]. Identification of epidermal growth factor receptor (EGFR) mutations has important therapeutic implications in LUAD [5] because tyrosine kinase inhibitors (TKIs) are effective against LUAD with EGFR mutations [6]. Patients with EGFR mutations are more sensitive to EGFR-TKIs than those with wild-type EGFR [7]. EGFR mutations mostly occur in exons 18, 19, 20, and 21, among which exon-19 deletion and exon-21 L858R are the most common subtypes [8,9], accounting for approximately 90% of EGFR mutation cases [10]. Patients who carry an EGFR mutation in exon 19 or 21 often have a higher radiographic response rate to EGFR-TKIs such as afatinib, erlotinib, and gefitinib [11], resulting in longer survival [12,13]. Therefore, early detection of EGFR mutations and subtypes is of great significance for therapeutic decisions.
Bone metastasis often occurs in LUAD, with a prevalence of approximately 30-40% [14]. In clinical practice, if tissue biopsy of the primary LUAD is impossible to perform, a spinal metastatic lesion can be an important alternative for assessing EGFR mutation status [15]. However, biopsy of spinal metastases may damage the nerve fibers in the spinal cord and increase the risk of metastases [16,17]. Magnetic resonance imaging (MRI) is a noninvasive method that allows direct visualization of bone marrow abnormalities [18]. Contrast-enhanced (CE) MRI can clearly display enhanced regions within the tumor, distinguishing necrosis from solid tumor [19]. A previous study showed that CE MRI is sensitive for the diagnosis of small bone metastases [20]. Although MRI is a powerful diagnostic technique for NSCLC [21], no biomarker has been established for detecting EGFR mutations in spinal metastases by visual examination of MR images.
Radiomics refers to the quantitative analysis of medical imaging, with the capability of obtaining valuable information from imaging data that can be applied within clinical decision support systems for developing diagnostic and predictive models [22,23]. Many radiomic approaches have been used to detect EGFR mutations in NSCLC. However, the majority of previous studies have focused on EGFR mutations in primary lung cancer [24][25][26][27][28]. Recent efforts have evaluated the associations between radiomics features derived from metastatic lesions in the brain and EGFR mutation status [29][30][31][32]. Recent reports have revealed that MRI features of bone metastasis are also related to EGFR mutation status [33,34]. However, these studies assessed only non-CE MR data, were based on limited sample sizes, and lacked external samples to verify their findings. To the best of our knowledge, the relationship between CE MRI of bone metastasis and EGFR mutation status has not yet been clarified. Therefore, this study aimed to explore the value of CE MRI-based radiomics for identifying EGFR mutations and subtypes based on spinal metastasis.

Patients
This retrospective study was approved by the Medical Ethics Committee of Liaoning Cancer Hospital and Institute, and the requirement for informed consent was waived. A primary set of 159 patients was enrolled from Liaoning Cancer Hospital and Institute between Jan. 2017 and Sep. 2021. An external set was established with 24 patients from Shengjing Hospital between Jan. 2017 and Sep. 2021. All patients were pathologically diagnosed with spinal metastases from primary lung adenocarcinoma. The inclusion criteria were as follows: (i) age > 18 years; (ii) CET1 MRI scans before surgery; and (iii) complete clinical data. The exclusion criteria were as follows: (i) other malignant tumor diseases; (ii) radiochemotherapy or treatment with phosphate-containing drugs; and (iii) vertebral compression fractures. The primary set was divided into a training set and an internal validation set at a 2:1 ratio by stratified sampling. The external set was used for independent validation. Clinical data including age, sex, smoking status, performance status, carcinoembryonic antigen (CEA), cytokeratin (CYFRA), and neuron-specific enolase (NSE) were collected from the hospital's medical system. Figure 1 shows the patient recruitment process.
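The 2:1 stratified split described above could be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name, the random seed, and the class counts are assumptions, since the paper does not report the software used for the split or the class breakdown of the primary set.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=2 / 3, seed=0):
    """Return (train_idx, valid_idx), splitting each class separately
    so that both subsets keep roughly the same label proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        members = by_class[label]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train_idx.extend(members[:n_train])
        valid_idx.extend(members[n_train:])
    return sorted(train_idx), sorted(valid_idx)


# Hypothetical labels: 1 = EGFR mutation, 0 = wild-type.
# The 100/59 breakdown is illustrative only; it is not reported in the paper.
labels = [1] * 100 + [0] * 59
train_idx, valid_idx = stratified_split(labels)
```

Splitting within each class, rather than over the pooled cohort, keeps the mutation rate comparable between the training and internal validation sets, which matters when one class is the minority.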

MRI acquisition and metastasis delineation
MRI scans in the primary and external cohorts were performed using a 3.0 T scanner (Siemens Magnetom Trio, Erlangen, Germany). The contrast-enhanced T1-weighted MRI parameters were as follows: echo time (TE) = 9.0 ms, repetition time (TR) = 550 ms, slice thickness = 4 mm, scan interval = 4.4 mm, field of view = 640 × 640 mm and matrix size = 256 × 256. The contrast agent (Gd-DTPA-BMA, Omniscan, GE Healthcare) was injected intravenously at a dose of 0.2 mmol/kg, followed by a 20 mL saline flush at a rate of 2.0 mL/s. Spinal metastases were manually segmented by a radiologist (Xinyan Sun) with three years of experience to generate regions of interest (ROIs) along the border of the metastatic lesions on the CET1 MR images. The segmentations were cross-checked by a senior radiologist (Yue Dong) with 17 years of experience. The delineated ROIs were stored in NII (NIfTI) format.

Feature extraction
The delineated ROIs on the MRI slices were used to extract radiomics features using the pyradiomics package in Python version 3.6. A total of 1967 features were extracted, comprising first-order statistical, shape-based and textural feature families. Various filters including wavelet, square, square root, gradient, exponential, logarithmic, local binary pattern, and Laplacian of Gaussian were used to transform the original MR images. Then, first-order statistical and textural features were extracted from the transformed images to obtain the filtered features. Detailed protocols can be found in a previous study [35] and the pyradiomics documentation, which is available at https://pyradiomics.readthedocs.io/en/latest/.
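As a rough illustration of the extraction step, the filters listed above map onto PyRadiomics image types as in the sketch below. The Laplacian of Gaussian sigma values, the choice of the 2D local binary pattern variant, and the helper name `extract_features` are assumptions; the paper does not report its extraction settings, so this sketch will not necessarily reproduce the 1967-feature set.

```python
# Image types named in the text, keyed by their PyRadiomics names.
# ASSUMPTIONS: the LoG sigma values and the use of LBP2D (rather than LBP3D)
# are placeholders; the paper does not state them.
IMAGE_TYPES = {
    "Original": {},
    "Wavelet": {},
    "Square": {},
    "SquareRoot": {},
    "Gradient": {},
    "Exponential": {},
    "Logarithm": {},
    "LBP2D": {},
    "LoG": {"sigma": [1.0, 3.0, 5.0]},
}


def extract_features(image_path, mask_path):
    """Extract features from one CET1 image and its ROI mask (hypothetical helper)."""
    # Imported lazily so the sketch can be read without PyRadiomics installed.
    from radiomics import featureextractor

    extractor = featureextractor.RadiomicsFeatureExtractor()
    extractor.enableImageTypes(**IMAGE_TYPES)
    extractor.enableAllFeatures()
    result = extractor.execute(image_path, mask_path)
    # Keep feature values only; drop the "diagnostics_*" bookkeeping entries.
    return {k: v for k, v in result.items() if not k.startswith("diagnostics_")}
```

A call would look like `extract_features("patient001_cet1.nii", "patient001_roi.nii")`, where both file names are hypothetical.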

Feature selection and radiomics model construction
To estimate the stability and reproducibility of the extracted features, intraclass correlation coefficient (ICC) analysis was performed to assess interobserver agreement [36]. Thirty patients were randomly selected for the ICC analysis, 15 with wild-type EGFR and 15 with EGFR mutations. Features with ICC > 0.85 were retained. These candidate features were then evaluated using the Mann-Whitney U test, and those with P-value < 0.05 were retained and further selected by least absolute shrinkage and selection operator (LASSO) regression. The optimal lambda was selected with 5-fold cross-validation. By weighting the selected imaging features with their corresponding non-zero LASSO coefficients, radiomics signatures (RSs) were developed to predict EGFR mutations and subtypes (RS-EGFR, RS-19, and RS-21) by logistic regression using the "glmnet" package in R version 3.6.
All models were constructed on the primary training set and tested on both the internal and external validation sets.
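Once LASSO has selected the features, a radiomics signature of this kind is typically a linear predictor passed through the logistic function. The sketch below shows that calculation under stated assumptions: the feature names, coefficients, intercept, and feature values are hypothetical placeholders, not the fitted values of RS-EGFR, RS-19, or RS-21.

```python
import math


def radiomics_score(features, coefficients, intercept):
    """Linear predictor: intercept + sum(coefficient * feature value)."""
    return intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )


def predicted_probability(score):
    """Logistic transform of the linear predictor."""
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical non-zero coefficients and standardized feature values,
# for illustration only (not taken from the study).
coefficients = {"feature_a": 0.8, "feature_b": -0.5}
intercept = 0.1
patient = {"feature_a": 1.2, "feature_b": 0.4}

score = radiomics_score(patient, coefficients, intercept)
probability = predicted_probability(score)
```

Because only features with non-zero coefficients enter the sum, the signature stays compact, which is the practical appeal of LASSO when the candidate pool is much larger than the cohort.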

Statistical analysis
All statistical analyses were performed using R and MedCalc 20.0.14 (MedCalc Inc.). A two-sided P-value < 0.05 was considered significant. Differences in the distributions of variables between patient groups were evaluated using the t-test, Mann-Whitney U test, or chi-square test, as appropriate. The Youden index [37] was used to determine the optimal cutoff values in the ROC analysis. The area under the ROC curve (AUC), accuracy, specificity (true negative rate), and sensitivity (true positive rate) were calculated to assess the prediction capabilities. Figure 2 shows the workflow of this study.
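The evaluation metrics named above can be illustrated with a small, dependency-free sketch. The scores and labels are toy values rather than study data, and the function names are chosen here for illustration; the authors used R and MedCalc for the actual analysis.

```python
def roc_points(scores, labels):
    """Return (threshold, sensitivity, specificity) for each distinct score,
    classifying a case as positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for threshold in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
        points.append((threshold, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Cutoff maximizing the Youden index J = sensitivity + specificity - 1."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1)


def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy signature scores and labels (1 = mutation, 0 = wild-type); not study data.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]

threshold, sensitivity, specificity = youden_cutoff(scores, labels)
predictions = [int(s >= threshold) for s in scores]
accuracy = sum(p == y for p, y in zip(predictions, labels)) / len(labels)
area = auc(scores, labels)
```

The AUC does not depend on any cutoff, whereas accuracy, sensitivity, and specificity all change with the threshold, which is why the Youden index is used to fix one operating point before reporting them.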

Patients' characteristics
The clinical characteristics of the patients are summarized in Table 1. We included 62 patients (33.9%) with wild-type EGFR and 121 patients (66.1%) with EGFR mutations.

Radiomics feature selection
To predict EGFR mutations, we selected eight features as the most important predictors. All features belonged to the textural feature family and showed good prediction performance in terms of AUCs. The features showed statistically significant differences (P < 0.05) in predicting EGFR mutations. To predict exon-19 deletion and exon-21 L858R mutations, three and five most important features were selected, respectively. All features generated an acceptable predictive performance, with AUCs ranging from 0.621 to 0.723. Table 2 lists the prediction performance of the selected features. Figure 3 shows the correlations among the selected features.

Radiomics signature construction and validation
We combined the selected features with their nonzero coefficients to build the radiomics signatures (RSs). Figure 4 indicates that our RSs can effectively differentiate between patients with wild-type EGFR, EGFR mutations, exon-19 deletion and exon-21 L858R mutation.

Discussion
Noninvasive evaluation of EGFR mutation status and subtypes based on bone metastasis is of great clinical significance, yet it has not been well studied. Previous related studies mainly focused on primary lesions of lung adenocarcinoma and highlighted that radiomics can be helpful in predicting EGFR mutations and subtypes from medical imaging [10,21,26,38]. Recently, CE MRI has been shown to provide information about the vascularity of the tissue and its surrounding environment [39]. A previous report indicated that CE MRI can reflect an increase in tumor blood vessels in the paraspinal muscles and spinal canal, thereby assessing the spread of the tumor outside the skeletal system [40]. To the best of our knowledge, this study is the first attempt to investigate CE MRI-based radiomics for the assessment of EGFR mutation status in bone metastases. Eight features were identified as the most predictive for EGFR mutations, all of which were textural features. This may suggest that intratumoral heterogeneity within the spinal metastasis is related to EGFR mutation status, considering that textural features can reflect intratumoral non-uniformity [41]. This was partly consistent with Lindberg's study, which demonstrated the relationship between EGFR mutations and the heterogeneity and aggressiveness of lung cancer [42]. To predict exon-19 deletion and exon-21 L858R, we identified three and five most important features from the CET1 MRI, respectively. Notably, most of the selected features (four of five) for predicting exon-21 L858R belong to the textural feature class. This may indicate that the distribution of heterogeneity within the metastasis differs between tumors carrying exon-19 deletion and those carrying exon-21 L858R mutations. The selected original_shape_elongation feature describes the ROI shape. A higher value of this feature indicates a rounder tumor shape. Our findings suggest that tumors with exon-19 deletion tend to be rounder than those carrying exon-21 L858R or wild-type EGFR.
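To make the interpretation of the elongation feature concrete, the sketch below computes a simplified 2D analogue: the square root of the ratio of the minor to the major eigenvalue of the ROI coordinate covariance matrix. This is an assumption-laden simplification. The study used the 3D original_shape_elongation feature, and the elliptical ROIs here are synthetic. Values near 1 correspond to round shapes and values near 0 to strongly elongated ones.

```python
import math


def elongation_2d(points):
    """sqrt(minor / major eigenvalue) of the 2x2 covariance of ROI coordinates."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
    half_trace = (sxx + syy) / 2
    radius = math.sqrt(((sxx - syy) / 2) ** 2 + sxy ** 2)
    major, minor = half_trace + radius, half_trace - radius
    return math.sqrt(minor / major)


def filled_ellipse(a, b):
    """Integer grid points inside an axis-aligned ellipse with semi-axes a and b."""
    return [
        (x, y)
        for x in range(-a, a + 1)
        for y in range(-b, b + 1)
        if (x / a) ** 2 + (y / b) ** 2 <= 1
    ]


# Synthetic ROIs: a disc and a 4:1 ellipse (not patient data).
round_roi = elongation_2d(filled_ellipse(10, 10))
stretched_roi = elongation_2d(filled_ellipse(20, 5))
```

The disc yields a value of 1 and the 4:1 ellipse a value close to 0.25, matching the reading that a higher elongation value means a rounder lesion.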
Before this study, Jiang et al. reported that non-enhanced MRI-based radiomics can be used to assess EGFR mutation status in bone metastases but did not analyze the specific mutation sites [33]. Previous investigations on the assessment of exon-19 deletion and exon-21 L858R are limited. Li et al. [10] reported an AUC of 0.79 for detecting exon-19 deletion and exon-21 L858R using radiomics based on primary lung adenocarcinoma. Cao et al. [43] reported an AUC of 0.901 for detecting exon-19 deletion and exon-21 L858R based on metastasis from the original lung adenocarcinoma. However, the small sample sizes, single-center data and lack of external validation limit the clinical value of their findings. This study first predicted the EGFR mutation and then further assessed whether the mutation was located in exon 19 or 21. Our results revealed that features from CE MRI have good potential to detect EGFR mutations and subtypes.
Furthermore, we explored the association between clinical factors and EGFR mutation status, and found that no clinical factor was associated with EGFR mutation status or mutation subtypes in either the primary or the external cohort. This was inconsistent with previous reports indicating that age and smoking were highly correlated with EGFR mutations [26,40,44] and subtypes [10]. CEA level and sex were also previously suggested as independent predictors of EGFR mutation status [25,45]. However, some studies support our findings. Ren et al. [34], Kim et al. [39] and Zhang et al. [44] also found that age, smoking, CEA level and sex were not correlated with EGFR mutation status in their datasets.

Limitation
This study had some limitations. First, although the findings were externally validated on an independent set, the sample size was small because of data collection challenges. The most important features identified on CET1 MRI will need to be verified in a larger cohort. Second, some important MRI sequences (e.g., T1-weighted, T2-weighted, and T2-weighted fat-suppressed MRI) were not included. Lastly, this study only evaluated EGFR mutation status before treatment. Assessment of the resistance-associated EGFR T790M mutation is also important in clinical practice to improve individual treatment management.

Conclusion
In conclusion, this study assessed CET1 MRI-based radiomics for predicting EGFR mutation and subtypes based on the bone metastasis. The developed models performed well on the external set, which may indicate good potential in future clinical applications.